feature(xjy): add multi-task learning pipeline in jericho environment #465
Open
xiongjyu wants to merge 9 commits into opendilab:main
puyuan1996
requested changes
Feb 6, 2026
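The main goal of this PR is to introduce a multi-task learning training pipeline for the Jericho environment in LightZero and to fix the related adaptation issues.
Main changes:
New multi-task support: adds a multi-task learning config for the Jericho environment (jericho_unizero_multitask_config.py) and the corresponding DDP training config.
Environment and collector adaptation: fixes a bug in muzero_collector in episode mode and adapts the collector to the collection requirements of multi-task environments.
DDP training fix: fixes a bug in the DDP (distributed data parallel) setup for the Jericho multi-task environment.
Fixes a bug in the collect log output.
Removes unused config code and improves the naming of saved config files.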
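As a rough illustration of what a per-task config list for several Jericho games could look like, here is a minimal sketch in plain Python. It is not the content of jericho_unizero_multitask_config.py: the dictionary keys (`game_name`, `task_id`, `task_num`), the helper names, and the round-robin assignment of tasks to DDP ranks are all assumptions made for this example, and the three games are taken from the results listed in this description.

```python
from copy import deepcopy

# Three of the games named in the results of this PR description.
GAMES = ["detective", "omniquest", "zork1"]


def make_task_config(task_id, game, base, task_num):
    """Return a copy of the shared settings specialised for one game.

    All keys used here are hypothetical, not LightZero's actual schema.
    """
    cfg = deepcopy(base)  # never mutate the shared base config
    cfg["env"]["game_name"] = game
    cfg["policy"]["task_id"] = task_id
    cfg["policy"]["task_num"] = task_num
    return cfg


def split_tasks_across_ranks(num_tasks, world_size):
    """Assign task ids to ranks round-robin (an assumed scheme)."""
    return [list(range(rank, num_tasks, world_size)) for rank in range(world_size)]


base = {"env": {"max_steps": 100}, "policy": {"batch_size": 64}}
configs = [make_task_config(i, g, base, len(GAMES)) for i, g in enumerate(GAMES)]
assignment = split_tasks_across_ranks(len(GAMES), world_size=2)
```

With two ranks and three tasks, rank 0 would handle tasks 0 and 2 and rank 1 would handle task 1 under this assumed scheme. The real pipeline may distribute tasks differently.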
Experimental results:
[Figure: training curves]
[Figures: data collection curves, one per game: detective, acorncourt, omniquest, zork1]